A parallel pattern for iterative stencil + reduce



Similar resources

Parallel visual data restoration on multi-GPGPUs using stencil-reduce pattern

In this paper, a highly effective parallel filter for visual data restoration is presented. The filter is designed following a skeletal approach, using a newly proposed stencil-reduce, and has been implemented by way of the FastFlow parallel programming library. As a result of its high-level design, it is possible to run the filter seamlessly on a multicore machine, on multi-GPGPUs, or on both....


Optimizing Data-Parallel Stencil

We have developed a communication optimizer that concentrates on stencil communication patterns. This optimizer has been done in the context of the UNH C* compiler that targets distributed-memory MIMD computers. Our work has two distinguishing features: The compiler/optimizer is designed to be highly portable. We achieve this goal by providing efficient support for the optimizations in the run-ti...


Stencil-Aware GPU Optimization of Iterative Solvers

Numerical solutions of nonlinear partial differential equations frequently rely on iterative Newton-Krylov methods, which linearize a finite-difference stencil-based discretization of a problem, producing a sparse matrix with regular structure. Knowledge of this structure can be used to exploit parallelism and locality of reference on modern cache-based multi- and manycore architectures, achievin...


Distributed Dynamic Load Balancing for Iterative-Stencil Applications

In the context of jobs executed on heterogeneous clusters or Grids, load balancing is essential. Indeed, a slow machine must receive less work than a faster one; otherwise, the overall job termination will be delayed. This is particularly true for Iterative-Stencil Applications, where tasks are run simultaneously and are interdependent. The problem of assigning coexisting tasks to machines is call...


Efficient multicore-aware parallelization strategies for iterative stencil computations

Stencil computations consume a major part of runtime in many scientific simulation codes. As prototypes for this class of algorithms we consider the iterative Jacobi and Gauss-Seidel smoothers and aim at highly efficient parallel implementations for cache-based multicore architectures. Temporal cache blocking is a known advanced optimization technique, which can reduce the pressure on the memory...


Journal

Journal title: The Journal of Supercomputing

Year: 2016

ISSN: 0920-8542, 1573-0484

DOI: 10.1007/s11227-016-1871-z